Semantic and phonetic automatic reconstruction of medical dictations
Abstract
Automatic speech recognition (ASR) has become a valuable tool in large document production environments like medical dictation. While manual post-processing is still needed for correcting speech recognition errors and for creating documents which adhere to various stylistic and formatting conventions, a large part of the document production process is carried out by the ASR system. For improving the quality of the system output, knowledge about the multi-layered relationship between the dictated texts and the final documents is required. Thus, typical speech-recognition errors can be avoided, and proper style and formatting can be anticipated in the ASR part of the document production process. Yet – while vast amounts of recognition results and manually edited final reports are constantly being produced – the error-free literal transcripts of the actually dictated texts are a scarce and costly resource because they have to be created by manually transcribing the audio files. To obtain large corpora of literal transcripts for medical dictation, we propose a method for automatically reconstructing them from draft speech-recognition transcripts plus the corresponding final medical reports. The main innovative aspect of our method is the combination of two independent knowledge sources: phonetic information for the identification of speech-recognition errors and semantic information for detecting post-editing concerning format and style. Speech recognition results and final reports are first aligned, then properly matched based on semantic and phonetic similarity, and finally categorised and selectively combined into a reconstruction hypothesis. This method can be used for various applications in language technology, e.g., adaptation for ASR, document production, or generally for the development of parallel text corpora of non-literal text resources. 
In an experimental evaluation, which also includes an assessment of the quality of the reconstructed transcripts compared to manual transcriptions, the described method results in a relative word error rate reduction of 7.74% after retraining the standard language model with reconstructed transcripts. © 2010 Elsevier Ltd. All rights reserved.
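The align, match, and combine steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes word-level alignment with Python's `difflib`, and it uses character-level string similarity as a crude stand-in for phonetic similarity; the semantic matching component is not modelled at all. The function names and the `threshold` value are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(x, y):
    """Character-level similarity in [0, 1]; a rough proxy for phonetic closeness."""
    return SequenceMatcher(a=x, b=y, autojunk=False).ratio()


def reconstruct(draft, final, threshold=0.6):
    """Guess the literal transcript from an ASR draft and the edited final report.

    Assumption: a substitution whose two sides look alike is treated as a
    recognition error (take the final report's word); anything else is treated
    as formatting or style editing (keep the draft's spoken form).
    """
    a, b = draft.split(), final.split()
    out = []
    # Step 1: align the two word sequences.
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes():
        d, f = a[i1:i2], b[j1:j2]
        if tag != "replace":
            # equal/delete: keep the draft words; insert: d is empty, so text
            # that appears only in the final report is treated as not dictated.
            out.extend(d)
            continue
        # Step 2: match substituted material, word by word when lengths agree,
        # otherwise as one span.
        if len(d) == len(f):
            pairs = list(zip(d, f))
        else:
            pairs = [(" ".join(d), " ".join(f))]
        # Step 3: categorise each pair and pick one side for the hypothesis.
        for draft_side, final_side in pairs:
            out.append(final_side if similarity(draft_side, final_side) >= threshold else draft_side)
    return " ".join(out)


if __name__ == "__main__":
    print(reconstruct("patient denies chest pane period",
                      "patient denies chest pain ."))
```

In this toy case the misrecognised "pane" is replaced by "pain" from the final report, while the dictated "period" is kept even though the report shows only the punctuation mark. A real system would need a pronunciation lexicon for the phonetic comparison and a semantic resource to recognise equivalences such as spoken numbers and their written forms.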
Similar resources
Semantics-based Automatic Literal Reconstruction Of Dictations
This paper describes a method for the automatic literal reconstruction of dictations in the domain of medical reports. The raw output of an automatic speech recognition system and the final report edited by a professional medical transcriptionist serve as input to the reconstruction algorithm. Reconstruction is based on automatic alignment between the speech recognition result and the edited re...
Generating Training Data for Medical Dictations
In automatic speech recognition (ASR) enabled applications for medical dictations, corpora of literal transcriptions of speech are critical for training both speaker independent and speaker adapted acoustic models. Obtaining these transcriptions is both costly and time consuming. Non-literal transcriptions, on the other hand, are easy to obtain because they are generated in the normal course of...
Comparison of the Effectiveness of Semantic Cognitive Reconstruction Therapy and Self-Encouragement Therapy on Chronic Fatigue in People with Psychosomatic Skin
The purpose of this study was to compare the effectiveness of semantic cognitive reconstruction therapy and self-encouragement therapy on chronic fatigue in people with psychosomatic skin. The research method was quasi-experimental, with a pretest-posttest design and a control group. The statistical population of the study included people with psychosomatic skin who were referred to...
Identifying Segment Topics in Medical Dictations
In this paper, we describe the use of lexical and semantic features for topic classification in dictated medical reports. First, we employ SVM classification to assign whole reports to coarse work-type categories. Afterwards, text segments and their topic are identified in the output of automatic speech recognition. This is done by assigning work-type-specific topic labels to each word based on...
Revealing the Structure of Medical Dictations with Conditional Random Fields
Automatic processing of medical dictations poses a significant challenge. We approach the problem by introducing a statistical framework capable of identifying types and boundaries of sections, lists and other structures occurring in a dictation, thereby gaining explicit knowledge about the function of such elements. Training data is created semiautomatically by aligning a parallel corpus of co...
Journal title:
- Computer Speech & Language
Volume 25, Issue
Pages -
Publication date 2011